Enhanced language modelling with phonologically constrained morphological analysis
Abstract
Phonologically constrained morphological analysis (PCMA) is the decomposition of words into their component morphemes, conditioned by both orthography and pronunciation. This article describes PCMA and its application in large-vocabulary continuous speech recognition to enhance recognition performance in some tasks. Our experiments, which use the British National Corpus and the LOB Corpus as training data and WSJCAM0 as test data, show clearly that PCMA leads to a smaller lexicon, smaller language models, superior word lattices and lower word error rates. PCMA appears to be most beneficial in open-vocabulary tasks, where the productivity of a morph-unit lexicon substantially reduces out-of-vocabulary rates.
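The out-of-vocabulary effect described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the suffix inventory `SUFFIXES`, the helper names `split_morphs` and `oov_rate`, and the toy word lists are hypothetical, and the splitter is a naive orthographic suffix stripper, not PCMA itself, which also conditions on pronunciation.

```python
# Toy illustration only: a naive orthographic suffix splitter, NOT the PCMA
# algorithm (PCMA also conditions the decomposition on pronunciation).
SUFFIXES = ("ing", "ed", "s")  # hypothetical suffix inventory


def split_morphs(word):
    """Split off at most one known suffix; return a list of morph units."""
    for suffix in SUFFIXES:
        # Require a stem of at least three letters to avoid splits like "r+ed".
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return [word[:-len(suffix)], "+" + suffix]
    return [word]


def oov_rate(tokens, lexicon):
    """Fraction of tokens that are not in the lexicon."""
    if not tokens:
        return 0.0
    return sum(tok not in lexicon for tok in tokens) / len(tokens)


# Hypothetical training and test vocabularies.
train_words = ["walk", "walked", "talk", "talking", "jump", "jumps"]
test_words = ["walking", "talked", "jumped", "talks"]

# Word-unit lexicon versus morph-unit lexicon built from the same training data.
word_lexicon = set(train_words)
morph_lexicon = {m for w in train_words for m in split_morphs(w)}

test_morphs = [m for w in test_words for m in split_morphs(w)]

word_oov = oov_rate(test_words, word_lexicon)
morph_oov = oov_rate(test_morphs, morph_lexicon)
print(f"word-unit OOV rate:  {word_oov:.2f}")
print(f"morph-unit OOV rate: {morph_oov:.2f}")
```

In this toy setup every test word is unseen as a whole word, yet each one decomposes into stems and suffixes that did occur in training, which is the kind of productivity the abstract attributes to a morph-unit lexicon. Real data would not separate this cleanly.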
Publication date: 2000